Word Formation Is Aware of Morpheme Family Size
Similar resources
Word Formation Is Aware of Morpheme Family Size
Words are built from smaller meaning-bearing parts, called morphemes. Just as one word can contain multiple morphemes, one morpheme can be present in different words. The number of distinct words a morpheme can be found in is its family size. Here we used Birth-Death-Innovation Models (BDIMs) to analyze the distribution of morpheme family sizes in English and German vocabulary over the last 200 year...
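The definition of family size above can be illustrated with a minimal Python sketch. The toy lexicon and the function name `morpheme_family_sizes` are hypothetical and chosen only for illustration; the morpheme segmentations are assumed to be given, and nothing here reproduces the paper's data or its BDIM analysis.

```python
from collections import defaultdict


def morpheme_family_sizes(segmented_words):
    """Map each morpheme to the number of distinct words that contain it."""
    families = defaultdict(set)
    for word, morphemes in segmented_words.items():
        for morpheme in morphemes:
            # A set ensures each word is counted once per morpheme.
            families[morpheme].add(word)
    return {morpheme: len(words) for morpheme, words in families.items()}


# Hypothetical toy lexicon: word -> its morphemes (assumed segmentation).
toy_lexicon = {
    "teacher": ["teach", "er"],
    "teaching": ["teach", "ing"],
    "reader": ["read", "er"],
    "reading": ["read", "ing"],
    "unreadable": ["un", "read", "able"],
}

sizes = morpheme_family_sizes(toy_lexicon)
for morpheme, size in sorted(sizes.items()):
    print(morpheme, size)
```

In this toy lexicon "read" occurs in three distinct words, so its family size is 3, while "un" occurs in only one.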
Morpheme-Enhanced Spectral Word Embedding
Traditional word embedding models only learn word-level semantic information from a corpus while neglecting the valuable semantic information in words’ internal structures, such as morphemes. To address this problem, this paper exploits morphological information to enhance the quality of word embeddings. Based on a spectral method, we propose two word embedding models: Morpheme on ...
Co-learning of Word Representations and Morpheme Representations
The techniques of using neural networks to learn distributed word representations (i.e., word embeddings) have been used to solve a variety of natural language processing tasks. The recently proposed methods, such as CBOW and Skip-gram, have demonstrated their effectiveness in learning word embeddings based on context information such that the obtained word embeddings can capture both semantic ...
Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling
This paper presents our segmentation system developed for the MLP 2017 shared tasks on cross-lingual word segmentation and morpheme segmentation. We model both word and morpheme segmentation as character-level sequence labelling tasks. The prevalent bidirectional recurrent neural network with conditional random fields as the output interface is adapted as the baseline system, which is further i...
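Treating segmentation as character-level sequence labelling can be sketched as follows. The B/M/E/S tag set used here (begin, middle, end, single) is an assumption for illustration; the abstract does not state which tag scheme the system uses, and the function names are hypothetical.

```python
def segments_to_tags(segments):
    """Convert a list of segments into one B/M/E/S tag per character."""
    tags = []
    for seg in segments:
        if len(seg) == 1:
            tags.append("S")  # single-character segment
        else:
            # begin, zero or more middles, end
            tags.extend(["B"] + ["M"] * (len(seg) - 2) + ["E"])
    return tags


def tags_to_segments(chars, tags):
    """Rebuild segments from characters and their predicted tags."""
    segments, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a segment closes on E or S
            segments.append(current)
            current = ""
    if current:  # tolerate a tag sequence that never closes
        segments.append(current)
    return segments


example_tags = segments_to_tags(["un", "read", "able"])
print(example_tags)
print(tags_to_segments("unreadable", example_tags))
```

A tagger then only has to predict one label per character, and the segmentation is recovered deterministically from the label sequence.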
Morpheme-Aware Subword Segmentation for Neural Machine Translation
Neural machine translation together with subword segmentation has recently produced state-of-the-art translation performance. The commonly used segmentation algorithm based on byte-pair encoding (BPE) does not consider the morphological structure of words. This occasionally causes misleading segmentation and incorrect translation of rare words. In this thesis we explore the use of morphological...
Journal
Journal title: PLoS ONE
Year: 2014
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0093978